Characterization of Multithreaded Scientific Workloads on Simultaneous Multithreading Intel Processors
Abstract
Simultaneous Multithreading (SMT) is a technique that allows multiple independent threads to execute instructions in the same cycle. Hyper-Threading (HT) is Intel's implementation of SMT, available on its recent processors. Multithreaded applications are a natural fit for SMT systems. However, because of extensive resource sharing, HT may not benefit OpenMP high-performance computing applications. In this paper, we first present the performance of different OpenMP constructs on dual and quad HT-based Intel Xeon servers under the 2.4.22 and 2.6.9 kernels. We find that the overhead of OpenMP constructs with HT enabled is an order of magnitude larger than with HT disabled. We then use a range of OpenMP applications from the NAS and SPEC OMPM2001 suites to measure performance on a dual Hyper-Threaded SMP. Our results indicate that the majority of applications benefit from a second thread in the one-processor case. However, only a few applications gain performance when HT is enabled on both processors. Data from hardware performance counters confirm that trace cache misses and the trace cache delivery rate are sources of the performance bottleneck.
Similar resources
Integrating Multiple Forms of Multithreaded Execution on SMT Processors: A Quantitative Study with Scientific Workloads
Simultaneous multithreaded (SMT) processors have penetrated the mainstream computing market, since they offer a number of cost / performance advantages over conventional superscalar processors at a nominal additional cost. Simultaneous multithreading can be used in the execution engine of a single monolithic microprocessor, or be embedded and replicated in the execution cores of a chip multipro...
Performance Evaluation of Intel's Quad Core Processors for Embedded Applications
Recently, multiprocessing has been implemented using either chip multiprocessing (CMP) or simultaneous multithreading (SMT). Multi-core processors, which represent the CMP approach, are widely used in desktop and server applications and are now appearing in real-time embedded applications. We are investigating optimal configurations of some of the available multi-core processors suitable for developing real-...
Comparing the Energy Efficiency of CMP and SMT Architectures for Multimedia Workloads
Chip multiprocessing (CMP) and simultaneous multithreading (SMT) are two recently adopted techniques for improving the throughput of general-purpose processors by using multithreading. These techniques are likely to benefit the increasingly important real-time multimedia workloads, which are inherently multithreaded. These workloads, however, often run in an energy constrained environment. This...
Multithreaded Processors
The instruction-level parallelism found in a conventional instruction stream is limited. Studies have shown the limits of processor utilization even for today's superscalar microprocessors. One solution is the additional utilization of more coarse-grained parallelism. The main approaches are the (single) chip multiprocessor and the multithreaded processor which optimize the throughput of multip...
Speculative Precomputation
Current processors are based on a multithreaded architecture. Simultaneous Multithreading (SMT) techniques are used to increase instruction throughput under a multiprogramming workload; however, they do not improve performance when only a single thread is executing. This communication explores Speculative Precomputation, a technique that uses idle thread contexts in a multithreaded architecture...